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Summary. Estimation of extreme-value parameters from observations in the max-domain 
of attraction (MDA) of a multivariate max-stable distribution commonly uses aggregated data 
such as block maxima. Since we expect that additional information is contained in the non- 
aggregated, single "large" observations, we introduce a new approach of inference based on a 
multivariate peaks-over-threshold method. We show that for any process in the MDA of the fre- 
quently used Husler-Reiss model or its spatial extension, the Brown-Resnick process, suitably 
defined conditional increments asymptotically follow a multivariate Gaussian distribution. This 
leads to computationally efficient estimates of the Husler-Reiss parameter matrix. Further, the 
results enable parametric inference for Brown-Resnick processes. 

A simulation study compares the performance of the new estimators to other commonly used 
methods. As an application, we fit a non-isotropic Brown-Resnick process to the extremes of 
12 year data of daily wind speed measurements. 
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1. Introduction 



Univariate extreme value theory is concerned with the limits of linearly normalized maxima 



of i.i.d. observations, namely the max-stable distributions (cf. de Haan and Ferreira ( 2006 1) 



Statistical inference of the parameters is well-developed and usually based on one of the 
following two approaches. Maximum likelihood estimation is applied to blockwise maxima 
of the original data, where a typical block size in environmental applications is one year. 
On the other hand, the peaks- over-threshold (POT) method fits a suitable Poisson point 
process to all data that exceed a certain high threshold and thus follow approximately a 



generalized Pareto distribution (cf. Davison and Smith (1990)). The advantage of the latter 
approach is that it avoids discarding extreme values within the blocks that are below the 
maximum but nevertheless contain information on the parameters. 

When interested in the joint extreme behavior of multivariate quantities, there are different 
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possibilities of ordering the data, though, the most common procedure is taking compo- 
nentwise maxima. In multivariate extreme value theory, a random process {£(£) : t G T} 
with some index set T is called max-stable, if there exists a sequence (?7j)jeN of independent 
copies of a process {rj(t) : t £ T} and functions c n (t) > 0, b n (t) € K., n G N, such that the 
convergence 



= lim c n (t) (mkxr]i(t) - &„(£) 

n— >oo \ i— 1 



£ e T, 



(1) 



holds in the sense of finite dimensional distributions. In this case, the process rj is said 
to be in the max-domain of attraction (MDA) of £. Typically, T is a finite set or T = 
R d , d 6 N, for the multivariate or the spatial case, respectively. Both theory and inference 
are considerably more demanding than in the univariate framework due to the fact that 
no finite-dimensional parametric model captures every possible dependence structure of a 



multivariate max-stable distribution (cf. Resnick (2008)). Similarly to the univariate case, 



a standard approach for parameter estimation of the max-stable process £ from data in its 
MDA is via componentwise block maxima, which ignores much of the information contained 
in the original data. Moreover, even if the exact max-stable process is available, maximum 
likelihood (ML) estimation is problematic since typically only the bivariate densities of max- 
stable distributions are known in closed form. Composite likelihood (CL) approaches are 
common tools to avoid this difficulty (cf. Padoan et al. (2010), Davison and Gholamrezaee 
p012"l )). 

Only recently, multivariate POT methods have attracted increased attention. In contrast to 
the univariate case, the definition of exceedances over a certain threshold is ambiguous. For 



instance, Rootzen and Tajvidi (2006) define a multivariate generalized Pareto distribution 



(MGPD) as the limit distribution of some multivariate random vector in the MDA of a 
max-stable distribution, conditional on the event that at least one of the components is 
large. A simulation study in |Bacro and Gaetan] ( |2012[ ) shows, that these MGPD perform 
well in many situations, yet, again only bivariate densities in a CL framework are used since 
multivariate densities are unknown. Alternatively, exceedances can be defined as the event 



that the norm of the random vector is large, giving rise to the spectral measure (cf. Coles and 
Tawn ( 1991 )). Engelke et al. (2012 1 have recently proposed to condition a fixed component 



on exceeding a high threshold, which enables new methods of inference for processes that 
admit a certain incremental or a mixed moving maxima representation. 

With regard to practical application such as modeling extreme wind speed or precipita- 
tion data, max-stable models need to find a compromise between flexibility and tractability. 



There are several parametric families of multivariate extreme-value distributions (see Kotz 



and Nadarajah (2000)) and only few max-stable models in the spatial domain (cf. |de Haan 



and Pereira (20061; Schlather (2002); Smith (1990)). For most of them, statistical inference 
is difficult and time-intensive. Furthermore, except for the max-stable process £ in itself, 
usually no further processes ij in the MDA of attraction of £ are known and thus, it lacks a 
theoretical connection between modeling the daily processes rj and modeling the extremal 
process £. 

In many applications such as geostatistics it is natural to assume that the data is normally 
distributed. Under this assumption, the only possible non-trivial limit for extreme obser- 



vations is the d-variate Husler-Reiss distribution (cf. Hiisler and Reiss (1989); Kabluchko 
(2011)). In fact, Hashorva (2006) and Hashorva et al. (2012) show that also other dis- 



tributions are attracted by the Husler-Reiss distribution. Hence, we can expect good fits 
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of this model if the daily data is close to normality. Recently, it has been shown that 



the class of Brown-Resnick processes (Brown and Resnick (1977); Kabluchko et al. (2009)) 
constitutes the spatial analog of the Hiisler-Reiss distributions since the latter occur as 
finite-dimensional marginals of the Brown-Resnick process. The research on both theoreti- 



cal properties (cf. Dombry et al. (2011); Oesting et al. (2012) for simulation methods) and 



practical applications (e.g., Davison et al. (2012)) of these processes is actively ongoing 
at present. Statistical inference, however, was so far limited to the CL methods based on 
bivariate densities. 

In this paper, we propose new estimation methods based on a POT approach for data in 
the MDA of Hiisler-Reiss distributions and Brown-Resnick processes. Similarly to Engclke 
et al. (2012), we consider extremal increments, i.e., increments of the data with respect to a 



fixed component, conditional on the event, that this component is large. The great advan- 
tage of this approach is the fact that the extremal increments turn out to be multivariate 
Gaussian distributed. This enables, for instance, ML estimation with the full multivariate 
density function as well as parameter estimation based on functionals of the Gaussian dis- 
tribution. Moreover, the concept of extremal increments as well as estimators derived from 
spectral densities are shown to be suitable tools for fitting a Brown-Resnick process based 
on a parametric family of variograms. 

The remainder of the paper is organized as follows. Section [2] comprises the definitions 
and some general properties of Hiisler-Reiss distributions and Brown-Resnick processes. 
In Section [3j we provide a result on weak convergence of suitably transformed and con- 
ditioned variables in the MDA of the Hiisler-Reiss distribution, which is the basis for our 
estimation methods. It is used to derive the specific asymptotic distribution for extremal 



increments (Section |3.l[ ) and for conditioning in the spectral sense (Section 3.2). In both 
cases, non-parametric estimation as well as parametric fitting of Brown-Resnick processes 
are considered. A simulation study is presented in Section |1J which compares the perfor- 
mance of the different estimators from the preceding section. As an application, in Section[5] 
we analyze daily wind speed data from the Netherlands and use our new methods of infer- 
ence to model spatial extreme events. Proofs of the theoretical results can be found in the 
Appendix. 



2. Hiisler-Reiss distributions and Brown-Resnick processes 

In this section we briefly review some details on Hiisler-Reiss distributions and Brown- 
Resnick processes and define extremal coefficient functions as a dependence measure for 
max-stable processes. 



2.1. Husler-Reiss distributions 



The multivariate Hiisler-Reiss distribution was introduced in Husler and Reiss ( 1989 1 as 
the limit of suitably normalized Gaussian random vectors. Suppose that the correlation 
matrix £„ in the n-th row of a triangular array of (k + l)-variate, zero-mean, unit-variance 
Gaussian distributions satisfies 



A = lim log(n)(l • 1 T - £„) € T>, 



(2) 
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where 1 = (1, . . . , 1) T <E R k+1 and V C [0, oo) ( - k+1 ^ x ( k+1 ^ denotes the space of symmetric, 
strictly conditionally negative definite matrices 

v =\ Ki)o<i,j<fc = A e [0,oo)( fe+1 ' x ( fe+1 ) : x T Ax < for all x e R k+1 \ {0} s.t. 



E 



i=0 



Xi = 0, dij = djj, an — for all < i,j < k 



Then the normalized row-wise maxima converge to the (k + l)-variate Hiisler-Reiss distri- 
bution which is completely characterized by the matrix A. Note that (1 ■ 1 T — £„) auto- 
matically lies in T> if £„ is non-degenerate, n£l. For any matrix A = (Af j) 0<i . <fc € T>, 
define a family of positive definite matrices by 

^i,m(A) = 2 I A m m + A m m — ^m,,m, ) „ ,,i 

where Z runs over 1, . . . , k and m = (m , . . . , mi) with < m < • • • < TO; < fe. The 
distribution function of the {k + l)-dimensional Hiisler-Reiss distribution with standard 
Gumbel margins is then given by 

tf A (x)=cxp J2 h lmA (x mi ,...,x mi ) \ , xel w , (3) 

where 

hi, m ,A(vo,---,yi) = f s{(y i -z + 2X 2 mumQ ). =1 J*^ m (A)}e- z dz, 
Jyo 

for 1 < / < k and /io,m,A(y) = exp(— y) for m e {0, . . . ,k}. Furthermore, for q G N and 
\& £ R qxq positive definite, S( ■ \^) denotes the so-called survivor function of a g-dimensional 
normal random vector with mean vector and covariance matrix "J/, i.e., if Y ~ N(0,^f) 
and x € R q , then S^xI'I') = P (Yx > X\, . . . , Y q > x q ). In the bivariate case, the distribution 
function ^ simplifies to 

H A (x 1 y)=c X p{-e-^(^+ 11 ^-^ - e -y^(x+°^jY x, y € R, (4) 

where A = Aq i G [0, oo] parametrizes between independence and complete dependence for 
A = oo and A = 0, respectively. 

Note that the class of Hiisler-Reiss distributions is closed in the sense that the lower- 
dimensional margins of H\ are again Hiisler-Reiss distributed with parameter matrix con- 
sisting of the respective entries in A. Consequently, the distribution of the bivariate sub- 
vector of the z-th and j-th component only depends on the parameter Ajj. Thus, one can 
modify this parameter (subject to the restriction A g D) without affecting the other com- 



ponents. This flexibility was demanded in Cooley et al. (2010) as a desirable property of 



multivariate extreme value models that most models do not possess, unfortunately 

Remark 2.1. The k-variate Hiisler-Reiss distribution is usually given by its distribu- 
tion function H\. The density for k > 3 is rather complicated and involves multivariate 
integration. Hence, for maximum likelihood estimation based on block maxima, only the 
bivariate or sometimes the trivariate (cf. Genton et al. { 201 1]) ) densities are used in the 
framework of a composite likelihood approach. 
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2.2. Brown-Resnick processes 

For T = R d , d > 1, let {Y(t) : t G T} be a centered Gaussian process with stationary 
increments. Further, let j(t) = E(Y(t) - Y(0)) 2 and a 2 {t) = E(Y(t)) 2 be the variogram 
and the variance of Y, t £ M. d , respectively. Then, for a Poisson point process X^ieN 011 
K with intensity e~ u du and i.i.d. copies Yi ~ Y, i € N, the process 



^)=maxU i + y i (t)-a 2 (t)/2, 



f e 



(5) 



is max-stable, stationary and its distribution only depends on the variogram 7. For the 
special case where Y is a Brownian motion, the process £ was already introduced by [Brown 
and Resnick (19771. Its generalization in (J5j) is called Brown-Resnick process associated to 



the variogram 7 ( Kabluchko et al 



p09| ). Since any conditionally negative definite function 
can be used as variogram, Brown-Resnick processes constitute an extremely flexible class 
of max-stable random fields. Moreover, the subclass associated to the family of fractal 
variograms 7 a ,s( - ) = II ■ /s|| a ,a G (0, 2],s G (0, 00), arises as limits of pointwise maxima of 
suitably rescaled and normalized, independent, stationary and isotropic Gaussian random 
fields (cf. Kabluchko et al. (2009)). Here || • || denotes the Euclidean norm. The model 
by Smith ( 1990 ) is another frequently used special case of Brown-Resnick processes, which 
corresponds to the class of variograms ",(/;) = ||/i£ _1 /i||, for h G R d and an axbitraxy 



pdxd 



covariance matrix X G 

We remark that the finite-dimensional marginal distribution at locations to 
a Brown-Resnick process is the Hiisler-Reiss distribution Ha with A = (^(U—t 



,t k eR d of 
j)/4)o<i,j<fe- 



2.3. Extremal coefficient function 

Since, in general, covariances do not exist for extreme value distributed random vectors, 
other measures of dependence are usually considered, one of which being the extremal 
coefficient 6. For a bivariate max-stable random vector (Xi, X2) with identically distributed 
margins, 9 G [1,2] is determined by 

F(Xi <u,X 2 <u)= ¥{X X < u) 9 , 

for some (and hence all) u G K. The quantity 9 measures the degree of tail dependence 
with limit cases 9=1 and 9 = 2 corresponding to complete dependence and complete 
independence, respectively. For a stationary, max-stable process £ on K d , the extremal 
coefficient function 9(h) is defined as the extremal coefficient of (£(0), £(/&)), for h G K d 
( |Schlather and Tawn| p003| ). 



For the bivariate Hiisler-Reiss distribution Q we have H\(u,u) = exp (— 2$(A)e _n ) and 
thus, the extremal coefficient equals 9 = 2$ (A). Hence, for Hiisler-Reiss distributions, there 
is a one-to-one correspondence between the parameter A G [0, 00] and the set of extremal 
coefficients. Similarly, the extremal coefficient function of the Brown-Resnick process in ^ 
is given by 

9(h) = 2$(VtW/2), h G R d . (6) 



Since there are model-independent estimators for the extremal coefficient function, e.g., the 
madogram in Cooley et al. (2006), it is a common tool for model checking. 
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3. Estimation 

In this section, we propose new estimators for the parameter matrix A of the Hiisler-Reiss 
distribution and use them to fit Brown-Resnick processes based on a parametric family of 
variograms. We will consider both estimation based on extremal increments and estimation 
in the spectral domain. 

Suppose that X, = (Jff \ . . . ,x\ k ) ), i = 1, . . . , n, are independent copies of a random 
vector X G M. k+1 in the MDA of the Hiisler-Reiss distribution Ha with some parameter 
matrix A = (^ij)o<i,j<k G F>. Recall that H\ has standard Gumbel margins. Without loss 
of generality, we assume that X has standard exponential margins. Otherwise we could 
consider (Uq(X^), . . . , Uk(X- k ')), where Ui = — log(l — Fi), and Fj is the cumulative 



distribution function of the i-th marginal of X (cf. Prop. 5.15 in Resnick ( 2008[ )). In the 



sequel, we denote by X„ = X — logn and X i Tl = Xj — logn the rescaled data such that 
the empirical point process II „ = ^, =1 8%, converges in distribution to a Poisson point 

process II on E = [-co, oo) h+1 \ {—oo} with intensity measure oo, x] c ) = — log -Ha( x ) 
(Prop. 3.21 in |Resnick| p008| ) , as n — > oo. Based on this convergence of point processes, 



the following theorem provides the conditional distribution of those data which are extreme 
in some sense. 

Theorem 3.1. For m G N and a metric space S, let g : J$ fe+1 — > S be a measurable 
transformation of the data and assume that it satisfies the invariance property g(x + a ■ 1) = 
g(x) for any a £ M. and 1 = (1, . . . , 1) g M' c+1 . Further, let u(n) > 0, n 6 N, be a sequence 
of real numbers such that Ywin^^ u(n) / n = 0. Then, for all Borel sets B G B{S) and 
A £ B[E) bounded away from —oo, 



lim I 

n— f oo 



' {<?(X„) G B | X„ G A- log M (n)} = Q g , A (B), (7) 
for some probability measure Q 3t A on S. 

Remark 3.2. Note that due to the invariance property of g, the transformed data is 
independent of the rescaling, i.e. g(Xi in ) = g(X.i), for all i = 1, . . . , n, n G N. 

In the above theorem, u(n) only has to satisfy u(n)/n — > 0, as n tends to oo. How- 
ever, for practical applications it is advisable to choose u{n) in such a manner that also 
lim„_ s . 00 u{n) = oo, since this ensures that the cardinality of the index set of extremal ob- 
servations 

I A = {ie {!,■■■, n} : X i>n e A - logu(n)}, (8) 
tends to oo as n oo, almost surely. 



Theorem 3.1 implies that for all extreme events, the transformed data {g(Xi) : i G Ia} 
approximately follow the distribution Q 9 .a- Clearly, Q 9 .a depends on the choices for g and 
A and in the subsequent sections we encounter different possibilities for which the limit |7]) 
can be computed explicitly. Furthermore, if g and A are chosen suitably, the distribution 
Qg,A will still contain all information on the parameter matrix A. Our estimators will 
therefore be based on the set of transformed data {g(Ki) : i G Ia} and the knowledge of 
their asymptotic distribution Q 9 ,a- For instance, a maximum likelihood approach can be 
applied using the fact that II n converges to II. If, for a particular realization of the X^, 
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Ia = {h, ■ ■ ■ , ijv} for some N < n, ii, . . . ,ijv G {1, • • • , n}, and g(X;) = Si, i = 1, . . . , n, a 
canonical approach is to maximize the likelihood 

L S , A (A; si, . . . , s„) =P{|Ja| = JV, g^) G ds i} , j = l,...,N} 

N 

= P (\I A \ = JV) JJP {.9(X) G (fei, I X„ G A - log U (n)} . 

With the Poisson approximation J2?=i l{X ijn G A — log u(ra)} ~ Pois{/i(^4 — logu(n))} and 
the convergence Q we obtain 

L g , A (A; a u ..., s„) « exp{- A1 (A - log u (n))} MA - J] ,:,/s, ). (9) 

If the ML approach is unfeasible, estimation of A can also be based on other suitably chosen 
functionals of the conditional distribution of <?(X), for instance on the variance of g(X). 



3. 1. Inference based on extremal increments 

In this subsection, we apply Theorem |3. 1| with g mapping the data to its increments w.r.t. 
a fixed index, i.e., g : R k+1 — > R k , x H> Ax = (x^ — x^°\ . . . — x^). In particular, 
g satisfies the invariance property g(x + a ■ 1) = g(x) for any a G R. Consequently, our 
estimators are based on the incremental distribution of those data which are extreme in the 
sense specified by the set A. The following theorem provides the limiting distribution Q 9t A 
for two particular choices of A, namely A\ = (0, oo) x Mr and A2 = [— 00, 0] c . 

Theorem 3.3. Let X be in the MDA of with some A G T>, and suppose that the 
sequence u(n) is chosen as in Theorem 3.1 Then, we have the following convergences in 
distribution. 

(a) For k G N, 

(x^ -X^,...^^ -X^\X^ > -logn(n)) 4jV(M,£), n — > 00, 

where Af(M, £) denotes the multivariate normal distribution with mean vector M = 
— diag( v E'fc.(o l ....fe)(A))/2 and covariance matrix £ = ^fc,(o,...,jfe)(A). 

(b) For the bivariate case, i.e., k = 1, 

- X^\X^ >-logu(n) orX^ >-log«(n)) 4 Z, n -> 00, 
where Z is a real-valued random variable with density given by 



Here, $ and <f> denote the standard normal distribution function and density, respec- 
tively. 
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Remark 3.4. The positive definite matrix E = ^fc,(o,...,fc)(A) contains all information 
on A. In fact, the transformation 



A(E) 



recovers the matrix A = (A? -)o<i,j<fe 






diag(E) T 


diag(E) 


1 diag(E) T + diag(E)l T - 2E 


0<i.j<k- 





Based on the convergence results in Theorem |3.3| we propose various estimation pro- 
cedures for both multivariate Hiisler-Reiss distributions (non-parametric case) and Brown- 
Resnick processes with a parametrized family of variograms (parametric case). 



3.1.1. Non-parametric multivariate case 

For the likelihood based approach in ^ we first consider the extremal set Ai = (0, oo) : 
and put Ni = \Iax |- By part one of Theorem 3.3 we have 



logL(A;s 



1 5 " " * ? 71 / 



log < exp(— u(n)) 



u(n) N 



n 



>Af(A),E(A) l s i 3/ 



cx ^ log det S(A) + \ £ {( Sij - Af(A)) T E(A)- 1 (s lj - M(A))} , 

(10) 



where is the realization of AXj, i = 1, . . . , n and </>m(A),£(A) is the density of the nor- 
mal distribution with mean vector Af(A) = — diag^fe ,(o,...,fc)(A))/2 and covariance matrix 
E(A) = ^kJo fc)(A). The corresponding maximum likelihood estimator is given by 



Amle = arg min 

AgX> 



2 



1 Nl 

log det£(A) + -A/(A)) T S(A)- 1 (s !j . - M(A))} 



(11) 



Notice that for this particular choice of A, the asymptotic value of Pfll^J = Ni) does not 
depend on the parameter matrix A. Hence, this ML ansatz coincides with simply maximiz- 
ing the likelihood of the increments without considering the number of points exceeding the 



threshold. In the bivariate case, i.e., k = 1 and A\ = (0, oo) xl, (10 1 simplifies to 



NX 2 1 Nl 

logi(A;si,...,s„) oc +7VilogA+ ^534' 



(12) 



3=1 



and the minimizer of (12 1 can be given in explicit form 

1 



A MLE — 2 ^ ' 



\ 



1 



1 Nl 1 

^S <ax " )2 ) 



(13) 
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Staying in the bivariate case, for the choice Ai = [— oo,0] c , we put N2 — \Ia 2 \ an d by 
part two of Theorem |3.3| 



]<.AKA:.s, s„,„ - I,,, { ,xp( • -2<l>< A)cl a)) ( M (*M"))* gA (^ ) 



cx 2$(A)u(n) 



1 n 2 

\2 X! S | ' 



8A 2 



3=1 



3 = 1 



Numerical optimization can be applied to obtain the estimator 

N 2 9 



Amle2 = argmin { 2§(V9)u(n) 



1 N2 ) 



While the above likelihood-based estimators (except for (13)) require numerical opti- 
mization, the following approach is computationally much more efficie nt: A natural esti- 
mator for X = &k,(o fe)(A) € M. kxk based on the first part of Theorem 3.3 is given by the 

(*0 



empirical covariance £ of the extremal increments AXj = (X^ — , 
for i e J^j , i.e. 



X ( 0)) 



1 Wl 

^ = ^rE( Ax ^-A)(Ax 2j -A) 



A 4 = 



1 ^ 

— VAX,.. 



(14) 



By Remark 3.4 this also gives an estimator Ay a r = A(£) for the parameter matrix A, which 



we call the variance-based estimator. Apart from its simple form, another advantage of 



(14) is that £ is automatically a positive definite matrix and hence, Ay ar is conditionally 



negative definite and therefore a valid matrix for a (k + l)-variate Hiisler-Reiss distribution. 



Note that ( 14 ) is not the maximum likelihood estimator (MLE) for S since the mean of the 
conditional distribution of AX^ depends on the diagonal of S. The MLE of £ is instead given 



by optimizing (11) w.r.t. £, which, to our knowledge, does not admit a closed analytical 
form. 



Applying ( 14 ) with k = 1 yields the bivariate variance-based estimator 



Av " ~ 4JV, 



"rlX'-xf-A)*, 



1 Nl 1 

3=1 



(15) 



Since the mean of the extremal increments is also directly related to the parameter A, 
another sensible estimator might be 
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3.1.2. Parametric approach for Brown-Resnick processes 

Statistical inference for Brown-Resnick processes as in ^ is usually based on fitting a 
parametric variogram model {7,9 : 1? € 0}, 6 C R J , j £ N, to point estimates of the ex- 
tremal coefficient function Q based on the madogram. Alternatively, composite likelihood 
approaches are used in connection with block maxima of bivariate data ( Davison and Gho 



lamrezaee (2012)) 



Since for to, . . . , tk € K , the vector (£(to)j • • • > £(*fc)) with £ being a Brown-Resnick process 



associated to the variogram 7 
matrix 



[0, 00) is Husler-Reiss distributed with parameter 



A = (7(tj - £j)/4)o<i,j<fc, 
the above estimators enable parametric estimation of Brown-Resnick processes. In fact, 



replacing A in ( 1 1 ) by 



A(0) = (7^-^/4) 



0<i,j<k 



(17) 



leads to the ML estimator 

??MLE 



arg min{— logL(A(#); si, 

1?£0 



,s n )} 



with L as in (10). Note that, other than in classical extreme value statistics, here the use 



of higher dimensional densities is feasible and promises a gain in accuracy. 

Estimation of $ can also be based on any of the bivariate estimators A^ LE , Am LE2 , Ay ar , 

A^ can , or on the multivariate estimator Ay ar by "projecting" the latter matrix or the matrix 

consisting of all bivariate estimates onto the set of matrices {{jt>{ti — tj)/4)o<i,j<k-^ G @}> 

i.e., 



arg mm 



$PROJ 

can be any matrix norm 



A'- -7*(*i-*j)/4 



0<i,j<k 



(18) 



where 

Similar to Bacro and Gaetan (2012), the bivariate estimators can readily be used in a 



parametric composite likelihood framework. 



3.2. Inference based on spectral densities 

As at the beginning of Section[3j let Xj, i = 1, . . . , n, be a sequence of independent copies of 
X, already standardized to exponential margins, in the MDA of the max-stable distribution 
H^. Since we work in the spectral domain in this section, we will switch to standard Frechct 
margins with distribution function cxp(— 1/y), y > 0. More precisely, we consider the vectors 
Y = cxp(X) and Yj = exp(Xj), i = 1, . . . ,n, which are in the MDA of the Hiisler-Reiss 
distribution Ga(x) = H^(\ogx),x > 0, with standard Frechet margins. 
The most convenient tool to characterize the dependence structure of a multivariate extreme 
value distribution is via its spectral measure. To this end, let Y„ = Y/n and Yj .„ = Yj/n 
denote the rescaled data such that the point process P n = Y^i=i „ converges, as n —> 00, 
to a non-homogeneous Poisson point process P on [0,oo) fc+1 \ {0} with intensity measure 
^([0,x] c ) = — \ogG\(x). Transforming a vector x = (x , . . . ,x k ) e [0, oo) k+1 \ {0} to its 
pseudo-polar coordinates 



r = l|x||, w = r : x, 



(19) 
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for any norm || • || on 



we can rewrite v as a measure on (0, oo) x Sk, where Sk is the k- 



dimensional unit simplex Sk = {y > : = 1}. Namely, we have v(dx) = r~ 2 dr x M (dui) , 
where the measure M is called the spectral measure of G\ and embodies the dependence 
structure of the extremes. For our purposes, it is most convenient to choose the Li-norm, 

i-e-, ||x||i = Ei=o \ x i 
vq > 0, we obtain 



In this case, for the set A ro = {x e [0, oo) fc+1 \ {0} : ||x||i > r }, 



r^MiSk) 



(fc+1), 



(20) 



since the measure M satisfies J s u>i M(du>) = 1 for i = 0, . . . , k. Hence, the z/-measure of 
A ro does not depend on the parameters of the specific model chosen for M. The distribution 
function can be written as 



Ga(x) = exp 



wo 
x ' 



^)M(du 

Xk J 



x > 0. 



As the space of all spectral measures is infinite dimensional, there is a need of parametric 
models which are analytically tractable and at the same time flexible enough to approximate 
the dependence structure in real data sufficiently well. Parametric models are usually 
given in terms of their spectral density h of the measure M. The book by |Kotz and| 



Nadarajah (2000) gives an overview of parametric multivariate extreme value distributions, 
most of them, however, being only valid in the bivariate case. For the multivariate case 
only few models are known, e.g., the logistic distribution and its extensions (Joe 



Tawn 19901 and the Dirichlet distribution Coles and Tawn (1991). The recent interest 



in this topic resulted in new multivariate parametric models (Boldi and Davison (2007) 



1990 



Cooley et al. (2010)) as well as in general construction principles for multivariate spectral 



measures (Ballani and Schlather (2011)). All these approaches have in common that they 



propose models for multivariate max-stable distributions in order to fit data obtained by 
exceedances over a certain threshold or by block maxima. 

Given a para metric model for the spectral density h{ ■ ;i?), we have the analog result as in 
Theorem |3.1 1 for the Frechet case with A — A ro and g : R k+1 ->St,xi-> x/||x| 



satisfies the multiplicative invariance property g(a ■ x) = <?(x), for all a G 
version of ^ for this choice of g and A reads as 



i , which now 
The Frechet 



lim 



Y/||Y||i e -B I Y„ e A 



,/u(n)} 



M(B) 
M(S k ) 



h(Lj;$)du, (21) 



for all B G B(Sk) and u(n), n € N, as in Theorem 3.1 Based on this conditional distribution 
of those Yj for which the sum || Yj||i is large, similarly to ^ we obtain the likelihood 

La to (i?;(ri,wi),...,(r„,a> n )) 

|/o|! t\i 



ex Y[h(u t ;&), 



(22) 



where {(r^Wj) : 1 < i < n} are the pseudo-polar coordinates of {Yi. n : 1 < i < n} as 



in (19) and I is the set of all indices 1 < i < n with Y. Ln £ A n Ju(n). Note that the 
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proportional part in ( 22 1 only holds because the z/-measure of A ro is independent of the 
model parameter 

For the Hiisler-Reiss distribution it is possible to write down the spectral density h( ■ ; A) 
explicitly. 

Proposition 3.5. For any matrix A = (A? „•')„-. € I? t/ie Hiisler-Reiss distribution 

a V 1 >J ' 0<l,j<k 

can be written as 

G(x) = exp < — / max [—,...,— ] /ifu;; A) > , 
I ./s* V^o a;*,/ J 

mf/i spectral density 

1 / 1 



* ( "' A) = • • • ^(2^! det El y 2 eXP ^i" T ^" ) » (23) 
w/iere S = ¥*,(„,...,*) (A) and u = (log ^ + 2A? : 1 < * < fc) T . 



3.2.1. Non-parametric, multivariate case 

Based on the explicit expression for the spectral density of the Hiisler-Reiss distribution 
in ( 23 ) , we define the estimator AgpEc of A as the matrix in T> that maximizes the likelihood 
in (221, i.e., 



Aspec = argmin [ ^ log det **,( ,... t *)(A) + 

In the bivariate case, the spectral density in (23) simplifies to 

1 



fe,(0,...,fe)(A) 



exp 



(log ^ + 2A 
8A 2 



2\2 



2Awgwi(2 7 r) 1 /2 

and the corresponding estimator can be given in explicit form 

1 



A 2 



SPEC 



-1 



1 Ul ieio 



(24) 



(25) 



Note that the estimators (24 1 and (251 have exactly the same form as the maximum like- 
lihood estimators (111 and (13 1, respectively, for the extremal increments. However, the 
specification of the set A differs and so does the choice of extreme data that is plugged in. 



3.2.2. Parametric approach for Brown-Resnick processes 

Analogously to Section |3.1.2| we obtain a parametric estimate of the dependence structure 
of a Brown-Resnick process based on a parametric family of variograms by replacing A on 
the right-hand side of d24) by A(i?) defined in (TTtJ) . This yields 



PEC 



arg min { - 
fee 1 



!ogiA ro (A(i?); (ri,u>i), . . . , (r n , <*>„))} 



(26) 



Estimation of Hiisler-Reiss distributions and Brown-Resnick processes 1 3 



4. Simulation study 

We compare the performance of the different parametric and non-parametric estimation 
procedures of Brown-Resnick processes and Hiisler-Reiss distributions proposed in the pre- 
vious section via a simulation study. 

In the first instance, we consider bivariate data that is in the MDA of the Hiisler- 
Reiss distribution with known dependence parameter A = Ao,i. For simplicity, we simulate 
data from the Hiisler-Reiss distribution itself, which does not mean that the thresholding 
procedure via the set A becomes obsolete. All estimators rely on considering only extremal 
events and hence, there is no obvious advantage over using any other data being in the MDA 
of H\. We compare the estimators A^ LE , Aj^ LE2 , ^Van ^Lan an d a spec fr° m Section [i] for 
different sample sizes n G {500,8000, 100000}. The sequence of thresholds u(n) is chosen 
in such a way that the number of exceedances k(n) increases to oo, but at the same time, 
the corresponding quantile q{n) = 1 — k(n)/n approaches 1, as n — > oo. In addition to 
the new threshold based estimators, we include the classical esti mators, which use block 



maxima, namely the madogram estimator A ma( jo = & 1 (0m&do/2) (Cooley et al. 2006) and 
the ML estimator A^ RMLE of the bivariate Hiisler-Reiss distribution. To model a year of 
(dependent) data, we we choose a block size of 150 which is of order of but less than 365. 
The pseudo-code of the exact simulation setup is the following: 

(a) for A 2 G {k • 0.025 : k = 1, . . . , 30} 

(b) for n G {500, 8000, 100000} 

(c) simulate n bivariate Hiisler-Reiss distributions 
with parameter A 

( d ) for ^ 2 {^MLE: ^MLE2: ^Var! ^mcam ^SPEC ^madoJ ^HRMLe} 

(e) estimate A 2 through A 2 

(f) obtain an estimate of the corresponding extremal 
coefficient 6 through § = 0(A) = 2$(A) 

(g) repeat ([a|)-([f]) 500 times 

Since the finite dimensional margins of a Brown-Resnick process are Hiisler-Reiss dis- 
tributed, we can easily implement step ([a]) by simulating a one-dimensional Brown-Resnick 
process with variogram 'y(h) = \h\ on the interval [0,3]. Since we consider bivariate Hiisler- 
Reiss distributions for different values of A 2 lying on a fine grid, we visualize the estimates 9 
as functions of the true A 2 (Figure]!]). However, it is important to remark that estimation in 
this first part of the study is exclusively based on the bivariate distributions. For each value 
of A 2 , we repeat simulation and estimation 500 times. Figure [l] shows the pointwise mean 
value of the extremal coefficient and the corresponding empirical 95% confidence intervals. 
As expected, in finite samples, all estimators based on multivariate POT methods under- 
estimate the true degree of extremal dependence since they are based on an asymptotic 
distribution with non-zero mean while the simulated data come from a stationary process. 
As the sample size n and the threshold u(n) increase, all estimators approach the true 
value. Among the POT-based estimators, A| PEC seems to be at least as good as the other 
estimators, uniformly for all values of A 2 under consideration. Ay ar performs well for small 
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values of A 2 but is more biased than other estimators for large values of A 2 . The good 
performance of A 2 lcan for large values of A 2 might be due to the fact that it only uses first 
moments of the extremal increments and is hence less sensible to aberration of the finite 
sample distribution from the asymptotic distribution. Compared to the two estimators 
based on block maxima, the POT-based estimators all perform well even for small data 
sets, which is a great advantage for many applications. Moreover, the variances of the 
POT-estimates are generally smaller than those based on block maxima, since more data 
can be used. Finally, note that the POT-based estimation does not exploit the fact, that 
the simulated data in the max-domain of attraction is in fact the max-stable distribution 
itself. The speed of convergence may though differ when using data from other models in 
the MDA. In contrast, X^ nado and Ah RMLE do profit from simulating i.i.d. realizations of 
the max-stable distribution itself since then, the blockwise maxima are exactly Hiisler-Reiss 
distributed and not only an approximation as in the case of real data. 

In the second part of the simulation study we examine the performance of parametric 
estimates of Brown-Resnick processes using the same data as above. While the true vari- 
ogram is j(h) = \h\, we estimate the parameter vector (a, s) for the family of variograms 
Ja,s(h) = \\h/s\\ a , a £ (0, 2], s > 0. We compare the following three estimators: the spectral 



estimator (a, s) SPEC , given by (26) and using the full multivariate density; the composite 

likelihood estimator (a, s) SPEC CL , defined as the maximizcr of the product of all bivariate 
spectral densities, implicitly assuming independence of all tuples of locations; and the least 
squares estimator (a, s) PROJ LS , given by (18) for the Euclidean norm, where Aj^g serves 
as non-parametric input. The estimated values of a and s are compared in the right column 
of Figure [2] The left panel shows the corresponding extremal coefficient functions for a and 
s representing the mean, the 5% sample quantile and the 95% sample quantile from the 500 
repetitions, respectively. 

The estimator (a, s)spec> wm ch incorporates the full multivariate information, perforins 
best both in the sense of minimal bias and minimal variance. Especially estimation of the 
shape parameter of the variogram gains stability when using higher-dimensional densities. 
The projection estimator seems to have the largest bias and the largest variance. The re- 
sults remain very similar if we replace A^j LE by one of the other non-parametric estimators. 
Let us finally remark that all three estimators can be modified by considering only small 
distances for inference. Then, since the approximation error of the asymptotic conditional 
distribution decays for smaller distances, this can substantially improve the accuracy in a 
simulation framework, but might distort the results in real data situations. 
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Fig. 1. Estimated extremal coefficients compared to the true ones of bivariate Husler-Reiss distribu- 
tions. 500 repetitions. Block size for block maxima is 150. Left: 9 vs. A 2 . Right: relative difference of 
6 to the true value of 6. 
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Fig. 2. Parametric fit of Brown-Resnick process. 
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Fig. 3. Left: locations of the 35 meteorological stations. Right: locations of the 25 non-coast stations 
before and after multiplication with the anisotropy matrix V0, c). 




5. Application: Wind speed data 

We apply the above theory of estimating Hiisler-Reiss distributions to wind speed data 
provided by the Royal Netherlands Meteorological Institute. We use the daily maxima of 
wind speeds measured at 35 meteorological stations x±, . . . , x 35 £ X, where the set X c M 2 
denotes the geographical coordinates of the Netherlands. The data cover a 23-year period 
of 8172 days from 1990/01/01 to 2012/05/12. Figure [3] provides an overview of the spatial 
locations of the stations. 



5. 1 . Stationarity assumption with zonal anisotropy 

In the sequel, we use the data to fit a stationary Brown-Resnick process based on the 
parametric family of variograms 7q. !S (/i) = ||/i/s|| a , a G (0,2], s > 0. As mentioned in 
Section |2.2| this subclass of Brown-Resnick processes is a natural choice, since they arise 
as the max-limits of suitably rescaled, stationary Gaussian random fields. The stationarity 
assumption, however, turns out to be unrealistic, since stations close to the coast exhibit 
weaker extremal dependence to neighboring stations than inland stations. This is illustrated 
in the left panel of Figure |4j where the estimated bivariate extremal coefficients based on 
mle °f an stations are compared to those without the coastal stations. Hence, we restrict 
our analysis to the 25 inland stations, say T = {xi, . . . , X25}, when fitting a stationary 
Brown-Resnick process. We therefore need to estimate the shape parameter a and the scale 
parameter s of the corresponding parameter matrix A Q S of the Brown-Resnick process on 
T, given by A QjS = (j a , s (xi - £j)/4) 1 < iJ < 25 . 

While the above class of variograms assumes isotropy of the underlying process, meteoro- 
logical data and particularly wind speed data can be expected to exhibit a main direction 
of spatial dependence. We capture this anisotropy by introducing a transformed space 
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Table 1. Estimation results. The values for the standard deviation are obtained from simulating and 
re-estimating the respective models 100 times. 



estimator 


a 


s 


P 


c 


$PROJ, LS 
$SPEC 
^SPEC, CL 


0.296 (0.0193) 
0.338 (0.0166) 
0.346 (0.0234) 


0.234 (0.0744) 
0.687 (0.1797) 
1.025 (0.4806) 


0.379 (0.532) 
0.456 (0.439) 
0.144 (0.520) 


1.67 (0.1761) 
2.21 (0.1596) 
1.61 (0.1846) 



X = VX (cf . right panel of Figure |3|) , where 

V = V(ft,c)=( c Zf Z£i ) > P e [0, 2n] , c> 0, 



is a rotation and dilution matrix; Blanchct and Davison (2011) recently applied this idea 



to the extremal Gaussian process of | Schlather ( 2002 ) . The new parametric variogram 
model becomes A# = (-/ a ,s{Vxi — Vxj)j^) x<i - <25 , where i) = (a,s,0,c) is the vector of 
parameters. As in the above simulation study, we apply the three estimators 

$PROJ, LS = ar gJ™g ||(*MLE,ij)l<ij<25 - A tf || 2 > ^SPEC, ^SPEC, CL- (27) 

For all estimators, the data is first normalized as described at the beginning of Section [3] 
and the threshold u(n) is chosen in such a way that, out of the 8172 days, all data above the 
97.5%-quantile are labeled as extremal. Note that these numbers coincide with the second 
set of parameters (n,q(n)) in the simulation study. Hence, the middle row of Figure [l] 
provides a rough estimate of the estimation error. 

The estimation results and standard deviations for the parameters (a, s, /3,c) are given 
in Table [T] The middle panel of Figure [4] illustrates the effect of transforming the space via 
the matrix V and displays the fitted extremal coefficient functions for the three estimators 
in p7| ). Moreover, the right panel shows the estimates of pairwise extremal coefficients 
based on A^ LE and the model-independent madogram estimator, where the latter exhibits 
a considerably larger variation. In Figure[5j we illustrate the effect of transforming the space 
via the matrix V(f3,c) on the extremal coefficient function and on a typical realization of 
the corresponding Brown-Resnick process. 

In order to validate the reliability of the estimated model parameters we re-simulate 
data in the MDA of the three fitted Brown-Resnick models. Similarly to the simulation 
study, we use 8172 realizations of the Brown-Resnick process itself (which is clearly in its 
own MDA) for the daily data. As index set, we use the transformed locations V(j3, c)T 
on which the Brown-Resnick process is isotropic. Based on this new data, we apply the 
estimation procedure exactly as for the real data to obtain new estimates for $ and thus 
for the extremal coefficient function. This is repeated 100 times and the results for the 
three different estimators in ( 27 1 are shown in Figure [6] In agreement with the results 



of the simulation study, the multivariate estimator $spec seems to be most reliable since 
the re-estimated extremal coefficient functions are close to the true value of the simulation. 
In contrast, the composite likelihood estimator i?spec, cl significantly underestimates the 
true degree of extremal dependence. This is probably a result of the false assumption of 
independence of bivariate densities which underlies the concept of composite likelihoods. 



Estimation of Husler-Reiss distributions and Brown-Resnick processes 1 9 



Fig. 4. Estimated extremal coefficients based on Am LE against distance between the stations. Left 
panel: original locations with and without coast stations. Middle panel: transformed locations (only 
non-coast stations), extremal coefficient functions corresponding to the parameters in Table [T] are 
included. Right panel: comparison to madogram estimator. 
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Fig. 5. Level lines of the extremal coefficient function and realizations of the fitted Brown-Resnick 
process. Left: Without transformation. Right: After space transformation. 





Fig. 6. Validation of estimation: Fitted extremal coefficient functions for 100 simulations of 8172 
Brown-Resnick processes on the transformed locations according to the estimated parameters. 
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Table 2. Fraction of cases in which the HGsler-Reiss model outperforms the Dirichlet and the 
weighted exponential model. 



number of stations 


k = 3 


k = 4 


k = 5 


fc = 6 fc = 7 




0.10 


0.25 


0.73 


1.00 0.99 


P(£hR > ^wExp) 


0.79 


0.96 


0.93 


1.00 



5.2. Non-stationarity 

In the previous subsection we excluded the 10 coastal stations from fitting the stationary 
Brown-Resnick model since they exhibit a different extremal dependence structure than the 
25 inland stations. Here we fit a multivariate Hiisler-Reiss distribution as extreme value 
model, which does not rely on any stationarity assumption. In particular, for T" being any 
subset of the 35 locations x±, . . . ,2^35 £ X, we estimate the fc(fc — l)/2 parameters of the 
dependence matrix A € ]R fcx ' £ , where fc = \T'\. To this end, we can use any of the three 
newly proposed estimators Avar, Amle and Aspec- While Ay ar is given in explicit form and 
hence computationally very efficient and applicable to arbitrary dimensions, the latter two 
estimators require numerical optimization. Fortunately, the respective likelihood functions 
can be still evaluated much faster than most of the commonly used spectral density models. 
For the ML algorithm, Ay a r and, since the class of Hiisler-Reiss distributions is closed, also 
the lower-dimensional parameter estimates provide reasonable starting values. 

In what follows, we use Ay a r as a starting value for the numerical optimization of Aspec- 
We compare the likelihood values of the Hiisler-Reiss model fit to those of two other para- 



metric models for spectral densities, namely the Dirichlet model ( Coles and Tawn 1991 ) 



and the weighted exponential model (Ballani and Schlather 2011). The comparison is 



based on randomly drawing fc = 3, 4, 5, 6 and 7 out of the 35 stations and fitting all three 
models. This is repeated 100 times. The weighted exponential model seems to fit worst for 
all fc € {3,4,5,6}. Note that numerical optimization for this model involves a rather com- 
plicated likelihood and is extremely time-consuming. This is why the weighted exponential 
model is only included for fc g {3, 4, 5, 6}. The Hiisler-Reiss model seems to outperform the 
Dirichlet model for fc > 5, which is not completely surprising since the Dirichlet model has 
only fc parameters, while the Hiisler-Reiss model has fc(fc — l)/2 parameters encoding the 
extremal dependence. The results are summarized by Figure [7J which shows boxplots of 
the maximum likelihood values for each of the 100 choices of stations, and Table [2] which 
shows the percentage of cases in which the Hiisler-Reiss model outperforms the Dirichlet 
and the weighted exponential model. 



6. Discussion 

This paper presents several new estimators for the Hiisler-Reiss distribution and its spa- 
tial analog, the Brown-Resnick process. The methods are based on asymptotic conditional 
distributions of random variables in the MDA of the Hiisler-Reiss model. Within the frame- 
work of multivariate peaks-over-threshold, it is shown how conditioning on different extreme 
events leads to different estimators. In particular, the concept of extremal increments turns 
out to be fruitful, since for the Hiisler-Reiss model the increments conditioned on a fixed 
component being large are approximately multivariate Gaussian distributed. This enables 
very efficient inference even for high dimensions. The simulation study shows, that the 
proposed estimators perform well, both in terms of bias and variance. Especially for small 
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Fig. 7. Comparison of different spectral density models based on the maximized likelihood. The num- 
bers above the boxes show the average computing time (in seconds) for the numerical maximization. 
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data sets they outperform classical block methods. Moreover, the non-parametric, bivariate 
estimators are a suitable tool for exploratory data analysis (such as distinguishing between 



coastal and inland stations in Section 5.1 ), since they are computationally efficient and yield 
reliable results. 

With regard to spatial extreme value statistics, one of the most promising models is the 
class of Brown- Resnick processes, due to their flexibility in connection with parametric 
families of variograms. The paper provides several methods for parametric fitting of these 
models. Particularly the good performance of the multivariate spectral estimator suggests 
that using higher-dimensional densities better captures the shape of the underlying vari- 
ogram than methods based on bivariate distributions only. Also for multivariate analysis of 
non-stationary data, the Hiisler-Reiss model is shown to be both well fitting and applicable 



in high dimensions due to low computational costs of the estimators (Section 5.2 1. 
While the simulation study in Section [4] already provides some empirical evidence for the 
consistency of the proposed estimators, a deeper analysis of the theoretical properties such 
as speed of convergence is left for future research. The main difficulty is to find appropriate 
assumptions such that the conditional increments converge not only in distribution but also 
in Li or L 2 . 

The idea of including all single extreme events into statistical inference, in connection with 
the concept of conditional increments, might also be applicable to other max-stable models 
such as mixed moving maxima processes. 



Supplementary material 



The raw data for the application can be downloaded from http://www.knmi.nl 
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Appendix 



Proof (of Theorem 3.1). Note that 

3 {s(X n ) G B | X n G A-logu(n)} =p{g(X n + logu(ra)) G S | X„ G A-logu(n)} 

p{x„ G (g- 1 (B)-logu(n))n(A~\ogu{n))} 
p{x„ G A — log u(n.)j 

n/tt(n)P{X-log(n/'u(n)) G g^ 1 {B) n A} 
_ rc/u(ra)P{X - log(ra/u(ra)) G A} ' 



Thus, applying Prop. 5.17 in Resnick (2008), we obtain 



lim P (s(X n ) G B I X„ 64 - logw(n)) 
and the measure Q 9j a is given by 



l i(g- 1 (B)nA) 
H(A) 



Proof (of Theorem |3.3[ ) . (a) The density of the exponent measure fj, of the Husler- 
Reiss distribution (J3j is given by 



n(dx , . . .,dx k ) 



7Z TT. ] ^, 1/9 ex P -o^ 2-1 ^ ) dx ...dx k , x ,...,x k e 
(2tt) 2 jdetEl 1 /^ V ^ 



where y = (x 1 - x Q + 2Aj , . . . ,x k - x Q + 2Ag ) and £ = * fe ,( ,...,fc)(A). For s = 
«fc) G K fe , B s = {y£l fc :!/,< s u i = l,...,k}, note that cr 1 ^) = {x G 



l . 



xq < Si, i = 1, . . . , k}. Thus, for A\ = (0, 



oo x 



obtain 



n(g- L (B s )nA 1 )= I < 



OO fXQ + Sl 

XQ 



xo+s k exp (_l y T S -l y j 



(2tt)t |detS|V2 



dx k . . . dxo 



*M,s(si, ■ • ■ ,Sfe), 



where $m.e is the cumulative distribution function of a fc-variate normal distribution 
with mean M = (— 2A^ , . . . , — 2A^ ) T and covariance matrix £ = ^ k ,(o,...,k)(^)- 
Since jJ.{Ai) — 1 and the family of sets {B s ,s G K fc } is a generator of th e B orel 



cr-algebra S(M' £ ) on R k , the first assertion follows from the proof of Theorem 3.1 
(b) In the bivariate case the density of /j, simplifies to 

6 / y — x\ 

/i(dx,dy) = — (j) ( A+ J dxdy, x, y G M, 
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with A = A ,i. We consider the set A 2 = [-00, 0] c c M 2 and note that n(A 2 ) = 2$(A). 
It thus suffices to compute n(g~ 1 (B t ) n A2) for tel. For t < we have 



fiig-^BAnAz) 
and similarly for t > 0, 



00 /-x+t g— a 



2A 



A + 



y-x 

2A 



dx = $ ( A 



2A 



v(g- 1 (B t )nA 2 )= f i(A 2 )-$ A 



2A 



By the above considerations and the proof of Theorem 3.1 this yields 



lim P 

n— too 



- X w < t > logu(n) or > logu(n)} 



In other words, X^> — X^ conditional on either X^ or being large converges 
in distribution to some random variable Z with density 



9\(t) = 



1 

4A$(A)' 



A- 



f e 



Proof (of Proposition 3.5). By Theorem 1 in Coles and Tawn ( 1991[ ) we can com- 
pute the spectral density h as a derivative of the exponent measure i/(x) = — logC?A(x), 
namely 



Oxq . . . dxk 



\i=0 



Ek ) • • • 3 y^v/c 

i=0 Xi 2^i=0 



A 



Since all but one summands of the exponent measure v vanish, it suffices to evaluate 
d 



dxo ■ ■ ■ dxu 



(-l) fe 



.(y|E)dl/i...di/ fc e-*<fa, (28) 



where </>( ■ |E) is the density function of the A:-dimensional normal distribution with covari- 
ance matrix E. Carrying out this computation yields (23 1. 
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